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1  Abstract 


The  objective  of  this  work  is  the  development  of  efficient  signal  processing  techniques  for  the  de¬ 
tection  and  classification  of  military  munitions  in  shallow  underwater  environments  using  data 
collected  from  synthetic  aperture  sonar  (SAS)  systems.  In  this  final  report  we  first  address  the 
problem  of  detecting  the  presence  of  underwater  munitions  using  the  multichannel  coherence  anal¬ 
ysis  framework.  Our  detection  hypothesis  is  that  the  presence  of  munitions  in  the  sonar  backscatter 
collected  from  a  hydrophone  array  will  lead  to  higher  levels  of  coherence  compared  to  the  backscat¬ 
ter  from  the  seafloor  alone.  This  method  has  been  found  to  produce  excellent  detection  performance 
on  other  sonar  datasets.  Here,  detection  results  are  presented  on  a  sonar  dataset  which  was  col¬ 
lected  in  a  relatively  controlled  and  clutter-free  environment.  Results  are  presented  using  standard 
performance  metrics  such  as  probability  of  detection  (Pd),  probability  of  false  alarm  (. Pfa ),  and 
Receiver  Operating  Characteristic  (ROC)  curve  characteristics. 

The  goal  of  the  second  part  of  this  work  is  to  develop  a  robust  target  classification  method 
that  can  be  applied  to  the  detected  contacts  to  discriminate  munitions  from  non-hostile  man  made 
objects  and  competing  clutter.  This  method  is  developed  based  upon  the  Matched  Subspace  Classi¬ 
fier  (MSC)  using  multidimensional  Acoustic  Color  (AC)  data  extracted  from  the  raw  sonar  returns. 
Scattering  models  developed  by  APL-UW  were  acquired  to  generate  the  required  training  dataset 
for  various  UXO  and  non-UXO  objects.  This  was  done  owing  to  the  fact  that  actual  sonar  data 
from  a  wide  range  of  UXO  and  non-UXO  objects  is  scarce  in  realistic  situations.  Although,  it  may 
be  somewhat  ambitious  to  expect  model  data  capture  all  the  essential  features  of  these  objects  for 
target  characterization,  it  will  provide  us  with  clues  on  how  to  augment  the  training  datasets  using 
perhaps  a  limited  training  samples  from  sonar  returns  of  actual  objects  to  improve  the  robustness 
in  different  environmental  conditions.  Our  classification  hypothesis  is  that  spectral  content  of  the 
sonar  backscatter  display  unique  acoustic  signatures  providing  excellent  discrimination  between 
different  classes  of  detected  objects.  Classification  results  of  the  MSC  using  three  different  signal 
subspace  construction  methods  are  presented  on  three  real  sonar  datasets.  The  first  two  sonar 
datasets,  PondEX09  and  PondEXlO,  were  collected  for  various  underwater  objects  using  a  rail  sys¬ 
tem  in  a  pond  under  relatively  controlled  and  clutter-free  conditions.  The  third  dataset,  TREX13, 
was  also  collected  using  the  same  rail  system  but  in  the  bay  area  off  of  the  Panama  City  coast 
where  other  factors  such  as  school  of  fish,  water  turbulence,  seafloor  roughness,  and  target  range 
were  more  realistic.  Results  are  presented  using  standard  performance  metrics  such  as  probability 
of  correct  classification  (Pec),  probability  of  false  alarm  (Pfa),  ROC  curve,  and  confusion  matrix 
of  the  classifier. 

2  Objective 

The  objective  of  this  work  is  the  development  of  automatic  target  recognition  (ATR)  algorithms 
for  the  detection  and  classification  of  military  munitions  in  shallow  underwater  environments  using 
data  collected  from  SAS  systems.  Specifically,  one  of  the  technical  questions  that  will  be  answered 
in  this  work  is  whether  or  not  multichannel  detection  [1]  using  the  Generalized  Likelihood  Ratio  Test 
(GLRT)  can  detect  munitions  with  sufficiently  high  probability  of  detection  and  localization  while 
at  the  same  time  providing  high  false  alarm  rejection  capability.  For  this  problem,  the  hypothesis  is 
that  the  presence  of  munitions  in  the  sonar  backscatter  collected  from  a  hydrophone  array  will  lead 
to  higher  levels  of  coherence  compared  to  the  backscatter  from  the  seafloor  alone.  This  increase 
in  coherence  can  give  one  an  indication  of  which  areas  of  the  held  may  contain  possible  munitions 
and  subsequent  classification  and  further  analysis  may  then  be  conducted  in  those  areas.  A  new 
version  of  this  detector  which  looks  for  high  coherence  in  two  frequency  bands  of  the  received  data 
was  developed  and  tested  in  this  study.  As  the  detector  must  be  applied  to  the  entire  target  held, 
it  is  important  that  the  developed  methods  are  computationally  efficient  as  well  as  providing  high 
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detection  rates  at  a  sufficiently  low  false  alarm  rate.  In  this  report,  we  present  detection  results 
of  the  new  version  of  the  multichannel  detection  technique  when  applied  to  the  PondEXlO  dataset 
consisting  of  sonar  backscatter  from  multiple  munitions. 

The  specific  objective  of  the  second  part  of  our  work  is  the  development  of  efficient  methods 
for  the  classification  of  military  munitions  in  shallow  underwater  environments  using  data  collected 
from  SAS  systems.  Specifically,  the  technical  question  that  is  addressed  in  this  work  is  whether  or 
not  the  Matched  Subspace  Classifier  (MSC)  [2]  can  successfully  be  trained  on  model-based  sonar 
data  of  various  UXO-like  objects  and  then  be  applied  to  real  sonar  datasets  to  discriminate  muni¬ 
tions  with  sufficient  accuracy  from  other  natural  or  man-made  competing  objects.  The  motivation 
here  is  related  to  the  fact  that  collecting  real  UXO  data  in  realistic  settings  is  a  difficult,  if  not 
impossible,  task.  Thus,  by  using  physical  models  to  construct  a  signal  subspace  spanned  by  the 
acoustic  response  of  a  particular  target  over  a  range  of  aspect  orientations,  our  hypothesis  is  that 
the  matched  subspace  classifier  in  [2]  can  provide  an  effective  classification  method  that  remains 
robust  to  changes  in  target  aspect.  Moreover,  by  keying  in  on  specific  target  responses,  the  clas¬ 
sifier  will  correspondingly  exhibit  low  false  alarm  rates.  The  development  of  systems  that  can  be 
trained  on  model-based  data  with  guaranteed  performance  on  real  data  could  provide  a  significant 
contribution  toward  solving  this  difficult  problem.  Throughout  the  course  of  this  research,  proof- 
of-concept  of  the  proposed  methods  will  be  provided  by  applying  the  trained  classifier  to  one  or 
more  representative  real  datasets  for  evaluation  purposes.  These  include,  but  are  not  limited  to, 
multiple  datasets  from  the  PondEX09  and  PondEXlO  exercises  as  well  as  a  more  recently  acquired 
dataset  referred  to  as  TREX13. 

The  comprehensive  testing  of  the  MSC  and  its  application  to  the  classification  of  munitions 
using  SAS  systems  is  another  primary  objective  of  this  work.  To  this  end,  we  test  the  hypothesis 
that  the  spectral  features  captured  in  the  AC  data,  extracted  from  the  sonar  backscatter  from 
various  objects,  display  unique  features  providing  excellent  discrimination  between  different  classes 
of  detected  objects.  In  this  final  report,  we  will  present  new  classification  results  of  the  MSC  when 
trained  via  K-SVD  [3]  and  LP-KSVD  [4]  methods  on  synthetic  AC  data  that  has  been  generated 
via  a  fast  ray  model.  These  classifier  systems  are  then  applied  to  the  entire  PondEX  and  TREX13 
datasets  to  test  the  generalization  ability  of  the  classifier. 

3  Background 

The  Department  of  Defense  (DoD)  is  currently  responsible  for  clearing  many  sites  which  are  poten¬ 
tially  contaminated  with  munitions  as  a  result  of  past  training  and  weapons  testing  activities.  In 
many  cases,  these  activities  occurred  near  or  were  performed  in  shallow  water  environments  where 
munitions  pose  serious  threats  to  public  safety  and  the  environment.  The  ultimate  goal  of  this 
work  is  to  provide  the  DoD  with  improved  detection  and  characterization  techniques  as  it  strives 
to  find  safer  and  more  cost-effective  technologies  for  underwater  munitions  remediation.  This  seed 
research  responds  to  SERDP  SON  MRSEED-14-01  in  Wide  Area  and  Detailed  Surveys  for  rapid 
and  highly  efficient  detection  and  classification  of  underwater  UXOs  found  in  contaminated  sites. 

Various  methods  have  also  been  developed  for  modeling  the  acoustic  response  of  objects  with 
geometries  typically  observed  in  mine  and  UXO-hunting  applications  and  using  this  information 
for  the  purposes  of  classification.  In  [5],  the  authors  considered  SAS  imaging  of  simple  targets  by 
combining  models  for  reverberation,  acoustic  penetration,  and  target  scattering  into  a  unified  model. 
This  is  then  used  to  generate  pings  suitable  for  SAS  simulations  over  a  range  of  environmental 
and  experimental  conditions.  Experimentally  measured  target  scattering  from  proud  and  buried 
targets  are  then  used  to  validate  the  model  through  several  simulations.  In  [6],  the  authors  analyze 
experimental  results  from  a  SAS  data  set  collected  in  a  fresh  water  pond.  These  measurements  were 
conducted  to  investigate  discrimination  capabilities  based  on  the  acoustic  response  of  targets  for 
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underwater  UXO  applications.  Results  from  this  study  show  that  it  is  possible  to  use  the  acoustic 
template  as  a  fingerprint  to  uniquely  identify  a  given  target.  In  [7],  it  was  further  shown  that 
these  acoustic-color  (AC)  features  are  useful  for  discriminating  similarly  shaped  targets.  In  [8]  and 
[9],  the  PI  and  his  team  developed  new  coherent-based  feature  extraction  and  SAS-like  acoustic 
color  for  detection  and  classification  of  underwater  mines  and  UXO  objects  using  the  Canonical 
Coordinate  Analysis  (CCA)  framework.  New  feature  and  decision-level  fusion  algorithms  were  also 
developed  in  [10]  and  [11]  using  a  hidden  Markov  model  (HMM)  and  a  Collaborative  Multi- Aspect 
Classifier  (CMAC)  to  improve  classification  of  mine-like  objects  while  reducing  the  false  alarms 
when  multi-pings/aspects  sonar  data  are  available. 


4  Materials  and  Methods 

In  this  section,  we  first  give  a  brief  review  of  the  theory  that  motivates  the  detection  algorithm 
studied  in  this  project.  More  specifically,  we  will  begin  by  reviewing  the  theory  of  multichannel 
detection  using  broadband  coherence  [1],  [12].  A  discussion  on  the  fast  ray  model  [13],  [14]  used 
in  generating  synthetic  AC  and  SAS  datasets  as  well  as  a  review  of  the  theory  that  motivates  the 
classification  algorithm  studied  in  this  project  are  then  provided.  This  includes  the  general  theory 
of  the  MSC  classifier  [2]  as  well  as  the  two  subspace  learning  methods,  K-SVD  and  LP-KSVD, 
which  were  utilized  in  training  the  signal  subspaces  for  the  MSC. 


4.1  Detection  Using  Broadband  Coherence 


In  this  section,  we  give  a  brief  review  of  the  theory  of  multichannel  coherence  [1],  [12]  and  discuss  the 
technical  approach  used  to  apply  this  theory  for  the  detection  of  underwater  munitions.  Consider 
a  set  of  L  random  data  matrices  {Xi}f=1  with  each  matrix 


= 


a;*  [0,0] 
a;*  [1,0] 


Xi[  0, 1] 

Xi[  1, 1] 


Xi[M  -  1,0]  Xi[M  -  1,1] 


Xi[0,N  -  1] 

xi\\,  N  —  1] 

Xi[M  -  1,  AT  -  1] 


€  C 


MxN 


(1) 


representing  a  2-dimensional,  zero-mean  random  process  captured  at  sonar  platform  i.  In  this 
particular  application,  the  random  variable  Xi[m,n]  represents  the  rfi1  temporal  sample  collected 
from  the  mth  hydrophone  element  in  an  array  from  the  frequency  band.  Stacking  the  columns 
of  Xj  to  form  the  vector  x*  =  vec(Xj),  the  composite  vector  z  =  [x^ 
matrix 


T  has  covariance 


Rii 

Rl  2  •  ■ 

•  Rl  L 

R  =  E  [zzH] 

= 

R?2 

i?22  • ' 

■  R2L 

e  0 LMNxLMN 

_ 1 

R?l  •• 

•  Rll 

U[x.txf]  € 

£MNxMN 

.  This 

matrix  captures  all  space-time 

with  Rik  =  Rj*t 
information  within  and  between  the  random  vectors  {xj}^_1. 

We  now  assume  we  are  given  an  experiment  producing  P  iid  realizations  {xj[p]}^=1  of  the 

t  h 

random  vector  associated  with  the  iul  frequency  band.  In  the  contexts  of  this  problem,  these  P 
independent  copies  represent  multiple  pings  collected  from  each  frequency  band.  All  P  realizations 
may  then  be  used  to  form  the  data  matrix 


z  =  mi]  •  •  •  = 


xi[!]  ' 

••  Xi[P] 

xl[1]  • 

••  *l[P] 

e  C 


LNMxP 
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where  z \p\  =  [xf[p]  •  •  -x^[p]]T.  Using  the  Generalized  Likelihood  Ratio  Test  (GLRT)  to  test  the 
null  hypothesis  that  R  =  blkdiag  {/?n, . . . ,  Rll}  involves  forming  an  estimate  of  the  composite 
covariance  matrix 


R=-ZZ 


H 


^zb]zHb] 

p=  1 


and  computing  the  following  likelihood  ratio  [1] 


.  det  R 

A  = - x- 

det  D 


R11 

R12 

R?2 

R22 

1 

?a> 

det/? 

n,=l  det  R,t 


R\l 

R2L 

Rll  . 


(3) 


Finally,  assuming  that  all  L  data  sequences  are  jointly  2-dimensional  WSS  and  using  results  on 
determinants  of  asymptotically  large  block-Toeplitz  matrices,  similar  arguments  as  that  given  in  [12] 
shows  that,  as  M,  N  — >  00,  the  likelihood  ratio  given  in  (3)  can  be  extended  to  the  2-dimensional 
frequency  domain 


A  mjv  —>.  exp 
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(4) 


with  S(e^0,  e^),  —ir  <  6  <  n,  —it  <  cj)  <  it 
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(5) 


an  estimated  composite  power  spectral  density  matrix.  Here,  Sik(e^0 ,  e^)  represents  a  quadratic 
estimate  of  the  cross-power  spectrum  between  channels  i  and  k  at  frequency  6  and  wavenumber  <f>. 
Thus,  the  likelihood  ratio  becomes  a  frequency /wavenumber  dependent  Hadamard  ratio  integrated 
over  the  Nyquist  band.  This  likelihood  ratio  can  then  be  compared  to  a  threshold  (determined 
experimentally  based  upon  some  training  data)  to  decide  whether  or  not  UXO’s  are  present.  More¬ 
over,  the  test  statistic  given  in  (4)  is  computationally  efficient  as  the  estimated  cross-spectral  matrix 
in  (5)  can  be  computed  using  a  Fast  Fourier  Transform  (FFT)  of  the  waveforms  received  by  each 
channel.  More  specifically,  if  we  let  X^{e?0 ,  e^)  €  C  represent  the  two  dimensional  FFT  of  the 
realization  of  the  data  matrix  given  in  (1)  at  frequency  0  and  wavenumber  (f>,  then  the  cross-power 
spectrum  may  be  computed  as 


Sik(ej6,e^) 


(6) 


4.2  Scattering  Models  Utilized  in  Classifier  Development 

In  order  to  tackle  the  task  of  constructing  template  signals  to  reliably  represent  the  various  UXOs 
in  our  MSC  system,  the  work  of  the  Co-PI  on  modeling  scattering  from  objects  at  a  water-sediment 
interface  has  been  utilized  [13]- [15].  The  scattering  model  allows  for  monostatic  SAS  data  sets  to 
be  simulated  via  a  fast  ray  model  that  combines  an  acoustic  ray  approximation  for  propagation  in 
a  fluid-filled  halfspace  with  scattering  from  a  target  in  a  number  of  conditions  and  media.  This  fast 
modeling  is  beneficial  for  generating  large  datasets  for  the  MSC  signal  subspace  construction.  In 
this  section  we  will  discuss  this  scattering  model  as  well  as  its  utility  in  our  classifier’s  development. 
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Figure  1:  The  path  1  is  a  direct  path.  The  paths  2  and  3  interact  with  the  sediment  once  and 
scatter  from  the  target  in  a  bistatic  direction.  Path  4  is  a  back-scattering  path  with  two  bottom 
interactions. 


Under  typical  operation  for  a  short-range  SAS  platform,  air-water  scattering  paths  can  be 
ignored,  because  paths  that  interact  with  the  air- water  interface  are  either  removed  by  time-gating 
the  received  signals  or  are  naturally  suppressed  by  the  directivity  of  the  source  and  receiver.  In 
addition,  the  separation  distance  between  the  actual  source  and  receiver  is  much  smaller  than  the 
distance  between  the  interface  and  the  target,  so  the  source  and  receiver  can  be  considered  to  be 
co- located.  Under  these  conditions,  only  the  four  ray  paths  that  are  shown  in  Figure  1,  associated 
with  the  actual  source  and  receiver  and  their  images  in  the  sediment,  contribute  to  the  scattered 
pressure.  In  this  model,  a  source,  receiver,  and  target  are  at  locations  rs,  rr,  and  rt  respectively 
and  a  single  image  source  is  located  at  rSi  with  image  receiver  at  rn.  The  source,  receiver,  and 
target  are  denoted  by  S,  R,  and  T,  respectively;  while  ST  and  Rl  in  Fig.  1  are  the  image  source 
and  image  receiver.  To  distinguish  path  2  and  path  3,  the  source  and  receiver  are  shown  at  distinct 
locations;  and  with  our  assumption  of  co-located  source  and  receiver,  paths  2  and  3  are  reciprocal 
and  paths  1  and  4  are  back-scattered.  With  the  specification  of  an  image  source  and  image  receiver, 
the  scattering  from  a  target  has  been  reduced  to  a  superposition  of  4  free-held  scattering  problems. 
Under  operational  conditions,  the  distance  associated  with  each  path  satisfies  A  where  A  is  the 
wavelength  of  the  pressure.  The  scattered  steady-state  pressure  can  then  be  written  as 

.  ,,  ,  .  exp (ikr) 

Ps  =  PoA  (k,;,  ks,w) -  (7) 

r 

where  po  is  the  amplitude  of  the  incident  pressure,  A  is  the  scattering  amplitude,  k;  and  ks  are  unit 
vectors  associated  with  the  direction  of  the  incident  and  scattered  fields,  u  is  the  angular  frequency, 
r  is  the  range  from  the  target  to  a  held  point  in  a  target-centered  coordinate  system,  and  exp (ikr)/r 
is  a  spherically  diverging  wave,  respectively.  The  scattering  amplitude  contains  all  the  information 
concerning  the  material  properties  of  the  target  (e.g.,  density)  and  the  directionality  of  the  scattered 
held.  A  scattering  amplitude  can  be  determined  from  an  analytic  solution  to  a  scattering  problem 
(e.g.,  scattering  from  a  spherical  target),  direct  measurement  from  an  actual  target,  or  numerical 
simulation  (e.g.,  a  finite  element  (FE)  model  for  a  given  target  as  was  used  in  Figures  4(c)). 


Incident  plane  wave  Observation  point 


Figure  2:  Free-held  scattering  assumes  a  portion  of  an  incident  wave  is  scattered  to  a  distant 
observation.  Here,  a  represents  a  characteristic  dimension  of  the  target. 


Combining  the  ray  model  paradigm  with  free-held  scattering  as  given  in  (7),  the  spectrum  of 


the  total  scattered  pressure  can  be  written  as 


P(u) 


^•1  (^0  riu)t\  |  V(dg)Mu)  icut2  I  V(0g)Mu)  iwts  _j_  20^4 1  p  / 

77e_r  77  e  i  77  c  i  77  o  r  Qjr  srcyw 

aitt2  c/2^3  d\d/±  d%d/± 


(8) 


with  di  =  |rs  —  rf|,  c?2  =  |r*  —  rr|,  cfa  =  |rSi  —  rt|,  and  dt  =  |rt  —  rrJ.  The  time  delays  are 
then  t\  =  (d\  +  d%)/c,  t?  =  (cfo  +  d$)/c,  t-,$  =  (d\  +  d^/c,  and  £4  =  (cfe  +  di)/c;  with  c  being 
the  speed  of  sound  in  water.  The  pressure  spectrum  Psrc(u)  represents  the  frequency  spectrum  of 
the  transmitted  wave  packet  from  the  source,  and  ro  is  a  reference  distance  associated  with  the 
calibration  of  the  source  (typically,  ro  =  1  nr).  The  scattering  amplitudes  Ak(u)  in  (8)  depend  on 
the  locations  of  the  sources,  receivers,  and  target.  Note  the  indices  of  Af.  correspond  to  the  path 
enumeration  described  before.  The  reflection  coefficient  at  the  water-sediment  interface,  which  is 
represented  by  V(9g)  and  is  a  function  of  the  grazing  angle  9g ,  is  defined  as  follows 


v(0  \  =  psin(flg)  —  (k2  —  cos2(0g))1/2 
9  psm(9g)  +  (k2  —  cos2(9g))1/2 

where  p  =  P2I  P\  and  k  =  (1  +  v5)/v  with  u  =  C2/C1.  Here  04  and  p\  are  the  speed  of  sound  and 
density  for  the  water,  and  02,  P2,  and  5  are  the  speed  of  sound,  density,  and  loss  parameter  for  the 
sediment.  An  inverse  Fourier  transform  of  P(u>)  thus  gives  a  generated  sonar  signal  that  includes 
the  four  primary  acoustic  paths  for  a  target  near  an  interface. 


4.3  AC  Data  Generation 

In  order  to  create  synthetic  AC  data  for  classifier  training,  raw  sonar  returns  generated  by  the 
above  method  were  processed.  Generation  of  AC  data  amounts  to  forming  the  intensity  of  the 
returned  spectral  power  over  the  entire  range  of  aspect  angles  that  are  modeled  in  a  linear  path 
SAS  (LSAS)  run.  This  is  accomplished  by  the  following  pi'ocedure:  First,  Finite  Element  (FE) 
model  [15]  is  implemented  to  produce  scattering  amplitude  signal  for  an  intended  target  or  non¬ 
target  objects.  These  scattering  amplitudes  are  modeled  for  acoustic  transmissions  and  returns  in 
the  frequency  range  of  0-31  kHz.  Next,  the  half-space  model  including  the  four  described  ray  paths 
in  (8)  is  utilized  to  generate  a  raw  sonar  return  dataset  by  generating  the  modeled  returns  of  a 
target  using  the  IFFT  of  (8)  over  a  pre-generated  coordinate  set  representing  the  various  positions 
along  a  linear  path  making  soundings.  In  this  simulation  model  (as  in  the  PondEX09-10  and  TREX 
data)  an  LFM  incident  signal  is  used  which  provides  spectral  information  of  an  object’s  response 
in  the  range  of  0-31  kHz.  Next,  these  raw  soundings  are  matched  filtered  (pulse  compressed)  with 
the  original  transmit  signal.  Then  the  FFT  is  taken  of  these  pulse  compressed  soundings.  The  new 
series  is  windowed  to  0-31  kHz  to  remove  the  unused  frequency  portions  and  isolate  the  frequency 
range  of  interest.  Finally,  the  aspect  dimension  of  the  AC  is  determined  as  a  function  of  the  path 
and  target  locations  and  the  corresponding  aspect  of  each  transformed  pulse-compressed  sounding 
is  plotted  against  the  frequency  axis  with  a  cool-hot  color  mapping  that  represents  spectral  energy 
intensity  for  a  given  frequency  and  aspect.  An  example  of  AC  data  generated  for  a  21  m  simulated 
LSAS  run  with  a  target  at  a  10  m  range  and  with  the  source/receiver  elevation  3.8  m  above  the 
water-sedinrent  interface  is  given  in  Figure  3. 

As  can  be  seen  from  this  figure,  the  observed  aspect  angles  are  limited  to  ~  86°  corresponding 
to  the  limited  aspect  orientations  that  an  observing  sonar  interface  experiences  in  a  21  m  linear 
path  sounding  an  object  at  a  range  of  10  m.  As  an  arbitrary  trajectory  of  a  SAS  platfonn  can  be 
modeled  with  the  fast  ray  model  [13],  the  model  was  used  to  generate  LSAS  sonar  data,  with  a  target 
located  at  the  center  of  the  LSAS  path,  at  a  variety  of  ranges,  with  the  sonar  interface  elevated  3.8 
m  above  the  sediment,  corresponding  to  the  PondEX  and  TREX  data  collection  procedures.  This 
was  repeated  for  several  objects  present  in  the  testing  data,  a  list  of  which  and  their  characteristics 
is  given  in  Table  2. 
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Synthesized  AC  data  --  alcyl3ft1  --  21  m  Run 


Frequency  (kHz) 


Figure  3:  Acoustic  Color  data  for  3  ft  Aluminum  Cylinder  using  LSAS  Fast  Ray  Model 


f  (kHz)  f  (kHz)  f  (kHz) 

Figure  4:  AC  data  of  a  bullet-shaped  aluminum  UXO  replica  at  10  m  range  generated  via  (a) 
Data  collected  during  Pond  EX  10  (SERDP  MR-1665),  (b)  FEM  and  Kirchoff-Helmholtz  integral 
result,  (c)  Fast  ray  model  method  with  a  scattering  amplitude  derived  from  the  scattered  pressure 
computed  in  (b). 


AC  data  generated  for  template  signals  via  the  described  scattering  model  was  then  utilized  as 
training  data  for  an  implementation  of  our  MSC  classifier.  The  major  benefit  of  utilizing  the  ray 
model  developed  in  [13]- [15]  is  that,  after  a  free-held  scattering  amplitude  for  a  desired  object  is 
collected  or  modeled  via  FE  methods,  the  regeneration  of  the  ray  model,  simulating  various  aspects 
and  orientations,  is  far  simpler  than  re-running  these  variations  with  the  same  slow  FE  method 
[15].  A  great  deal  of  spectral  information  for  object  discernment  seems  to  be  present  in  these  AC 
plots,  but,  as  can  be  observed  from  Figure  4(c),  the  fast  ray  model  generated  AC’s  seem  to  preserve 
a  great  deal  of  the  expected  spectral  information  of  a  modeled  object  when  compared  to  the  AC 
data  created  from  real  sonar  soundings  of  the  same  object  in  Figure  4(a). 

The  procedure  for  generating  testing  AC  data  for  real  raw  sonar  data  from  the  PondEXlO 
experiment,  such  as  that  shown  in  Figure  4(a),  is  very  similar  to  the  procedure  performed  on 
the  synthesized  sonar  data  described  above.  A  raw  sonar  data  time  series  collected  from  the 
PondEXlO  experiment  is  first  matched  filtered  with  a  replica  of  the  transmit  signal  that  was  used 
in  the  experiment  and  this  new  pulse  compressed  data  is  further  filtered  to  remove  returns  from 
the  neighboring  objects  from  those  of  the  object  of  interest.  This  filtering  utilizes  a  reversible  SAS 
imaging  process,  a  spatial  filtering  process  using  a  2-D  Tukey  window  [16],  and  a  pseudo-inverse 
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filtering.  This  inverse  filter  maps  the  SAS  image  back  to  the  pulse  compressed  version  that  now 
has  less  interference  from  neighboring  objects.  These  filtered  pulse  compressed  signals  are  then 
transformed  to  the  frequency  domain  via  FFT.  This  process  is  repeated  for  all  aspects  of  LSAS  and 
the  power  spectrum  is  then  generated  and  plotted  to  display  AC  for  each  object  in  the  real  sonar 
data  sets. 

4.4  Classification  Using  Matched  Subspace  Classifier 

In  this  section,  we  give  a  review  of  the  theory  behind  the  MSC.  This  classifier  operates  on  the 
multidimensional  AC  data  vectors  described  above.  This  is  done  with  the  assumption  that  such 
vectors,  each  belonging  to  a  given  class,  can  be  formed  using  some  linear  combination  of  basis 
vectors  associated  with  that  class  (i.e.  signal  observations  take  the  form  of  the  classical  linear 
model).  In  our  case,  these  basis  vectors  consist  of  the  acoustic  frequency  responses  (or  AC)  of 
an  object  at  a  given  aspect  angle.  The  MSC  considered  in  this  work  is  a  special  instance  of  the 
Weighted  MSC  discussed  in  [2],  with  identity  weighting  matrix.  For  signal  subspace  training,  we 
use  the  standard  singular  value  decomposition  (SVD)  [17],  K-SVD  [3]  and  LP-KSVD  [4]  signal 
subspace  construction  methods.  The  standard  SVD  is  used  as  a  benchmark  as  it  is  the  suggested 
signal  subspace  choice  for  Matched  Subspace  Detectors/Classifiers  [18].  The  last  two  signal-specific 
dictionary  learning  methods  are  presented  in  Sections  4.4.1  and  4.4.2. 

Our  classification  problem  is  referred  to  as  an  M— ary  classification  problem  in  which  the  fol¬ 
lowing  hypothesis  is  to  be  tested:  Our  observation  contains  image  sources  from  M  possible  classes 
and  one  is  uniquely  most  likely  to  be  generating  the  signal.  For  a  simple  consideration  of  classifying 
‘UXO’  vs.  ‘non-UXO’,  M  =  2. 

We  will  begin  by  defining  the  M— ary  classification  problem  by  considering  m  =  0  ...  M  —  1 
hypotheses  each  satisfying  the  signal  model, 

Y  =  H„,0  +  N  (10) 

Here,  our  observation  matrix  is  represented  by  Y,  a  sum  of  the  signal  subspace  component  Hm@, 
where  Hm  £  WlNxK  is  a  matrix  whose  columns  are  basis  vectors  that  span  the  subspace  associated 
with  the  mth  object  class,  an  unknown  parameter  matrix  0  £  WKx<*,  and  an  additive  zero-mean 
noise  matrix  N  £  M.NxQ .  The  distribution  of  the  noise  matrix  is  unknown,  and  hence  it  is  not 
possible  to  derive  the  Maximum  Likelihood  Estimates  (MLE)  of  the  unknown  parameters  [2],  We 
can  equivalently  express  (10)  for  a  single  observation  vector  y  (or  qth  column  of  Y)  as, 

yq  =  H  m0q  +  nq  (11) 

where  is  the  qth  column  of  matrix  N  and  6q  is  the  qth  column  of  0.  The  model-based  AC 

th 

subspaces  Hm  can  be  constructed  for  the  mh  object  class  utilizing  different  subspace  reconstruction 
methods.  Here,  we  used  the  SVD  [17],  K-SVD  [3]  and  and  LP-KSVD  [4]  dictionary  construction 
methods. 

The  core  idea  behind  the  weighted  MSC  is  the  implicit  suppression  of  large  amplitude  residuals 
by  weighting  each  of  the  row  residual  terms  in  the  discriminant  function 

Jm  =  tr{(Y  -  Hm0)TW(Y  -  Hm©)}  =  ||W^(Y  -  Hm0)||£,  Vm  £  [1,  ...,M]  (12) 

where  W  is  a  diagonal  matrix  with  weights  along  the  diagonal,  corresponding  to  row-weighting, 
and  ||A||j,  represents  the  squared  Frobenius  norm  of  matrix  A  which  is  ||A|||,  =  tr{AAT}  .  For 
a  given  weight  matrix  W,  the  weighted  least-squares  estimate  of  0  under  the  mth  hypothesis  is 
found  using 

©  =  (H^WHm)-1H^WY.  (13) 
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Using  0  in  (13)  we  can  get  the  value  of  the  discriminant  function  in  (12)  at  this  solution  i.e. 

Jm  =  tr{(YTW(I  —  Em)Y)}  (14) 

where  Em  is  an  oblique  projection  onto  subspace  (Hm)  i.e.  E m  =  Hm(H^ ,WHm)_1H^W.  This 
discriminant  function  can  be  reinterpreted  as  the  orthogonal  projection  of  the  weighted  data  matrix 
Z  =  W1/2Y  onto  subspace  (W1/2Hm),  or  equivalently,  zq  =  W1/2y(?,  for  single  vector  observation 
€  Y: 


Jm  =  tr{(ZT(I-Piyl/2Hm)Z)}  (15) 

where  Pwi/2H  =  W1/2Hm(H^WHm)_1(W1^2Hm)T  is  the  projection  matrix  onto  the  subspace 
spanned  by  the  columns  of  matrix  Wl//2Hm.  This  weighted  MSC  classifier  assigns  a  class  label  to 
an  observation  vector  yk  based  on  the  following  criterion, 


m*  =  arg  min  Jm  =  arg  min  {zj.  ( I  —  P 


me[l,M] 


W?Hn 


ZJ- 


(16) 


Figure  5:  Weighted  Matched  Subspace  Classifier  geometric  perspective  [2] 

As  Figure  5  and  equation  (16)  indicate,  this  classifier  measures  the  energy  in  each  of  the  sub¬ 
spaces  (W1/2!!^)  and  selects  the  class  label  corresponding  to  the  subspace  that  contains  the 
largest  amount  of  energy.  If  the  noise  vector  rig  is  assumed  to  be  normal  with  covariance  matrix 
E  [n9n^]  =  cr2I,  then  the  criterion  given  in  (16)  also  corresponds  to  the  decision  that  minimizes  the 
probability  of  assigning  an  erroneous  class  label.  It  is  important  to  note  that  for  W  =  I,  this  MSC 
reduces  to  the  standard  Matched  Subspace  Classifier/Detector  [18],  which  builds  our  classification 
approach  throughout  this  work. 

4.4.1  Signal  Subspace  Generation  using  K-SVD  Method 

Using  the  fast-ray  model  provided  by  APL-UW,  a  large  database  of  AC  signals  was  created  matching 
aspect  resolution  and  frequency  resolution  of  those  generated  for  the  real  sonar  data.  Using  this 
abundance  of  model  signals,  the  K-SVD  [3]  subspace  construction  method  was  implemented  forming 
various  Hm’s  for  the  two  classes.  A  brief  description  of  this  method  is  given  in  this  subsection. 

The  purpose  of  K-SVD  is  to  create  an  optimal  signal-dependent  dictionary  that  reduces  the 
dimension  of  a  signal  vector  by  representing  it  as  a  sparse  linear  combination  of  relatively  few 
atoms.  More  specifically,  K-SVD  aims  to  solve  a  constrained  minimization  problem  to  reduce  the 
reconstruction  error  in  a  set  of  training  vectors.  Let  Ym  £  M>NxQ  be  a  matrix  consisting  of  class 
m  (m  =  0,1)  training  data  vectors  yqn>  from  Section  5. 2. 1.1  for  q  £  [1  •  •  •  Q\  as  its  columns, 
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II,,,  G  WNxR  be  the  dictionary  matrix  to  be  found,  and  Xm  G  WKx®  be  the  sparse  representation 
of  Ym  in  terms  of  its  dictionary  atoms.  Note  it  is  desired  that  the  number  of  non-zero  elements 
of  each  x^ml  be  substantially  less  than  N  as  the  dimension  should  be  reduced  in  this  process.  The 
constrained  optimization  problem  [3]  is  given  by, 

min  {||Ym  -  HmXm|||*}  subject  to  ,  ||x(m)||0<T,  \/q  (17) 

-H-m  ,-A.ttx 

where  ||  •  ||^  is  the  Frobenius  norm  of  a  matrix,  and  ||  •  ||o  is  the  £$  norm  which  counts  the  non-zero 
elements  of  a  vector. 

During  the  training,  the  K-SVD  algorithm  is  composed  of  two-phases.  First,  a  sparse  represen¬ 
tation  phase  where  for  each  y^"'1  the  corresponding  x^”"1  is  computed  based  on  a  given  Hm  using 
a  pursuit  method  such  as  Orthogonal  Matching  Pursuit  (OMP)  [19].  Second,  a  dictionary  update 
phase  where  h/,  G  H  m  is  updated  based  on  minimizing  the  reconstruction  error  using  the  SVD  of  a 
restricted  error  matrix  E^.  The  steps  of  this  procedure  are  outlined  in  Table  1.  These  two  phases 
are  repeated  until  convergence  through  monotonic  MSE  reduction,  a  more  detailed  description  of 
which  is  given  in  [3]. 


Table  1:  K-SVD  Algorithm 


K-SVD  Optimal  Dictionary  Construction  Algorithm: 

Initialization:  Set  the  dictionary  matrix  G  M>NxK  with  K  randomly  se- 
lected  Z2  normalized  columns  of  Ym.  Set  J  =  1.  Repeat  following  steps  until  a 
stopping  rule  is  met. 

Sparse  Coding  Stage:  Generate  Xm  by  computing  the  sparse  representation 
Xq,n)  for  each  y,m^  based  on  Hm  using  the  Fast-OMP  pursuit  method  [19]. 

Dictionary  Update  Stage:  Each  column  hj,,  k  G  [1  in  is  up¬ 

dated  by: 

1.  Compute  k— exclusive  error  matrix  E&  =  Ym  —  Ylj^k  h.jx.7*>  where  Xj*  is 
the  jth  row  of  Xm. 

2.  Define  column  indices  of  training  data  Ym  that  use  the  kth  atom  in  their 
reconstruction  via  Hm:  uik  =  {*  |1  <  *  <  Q,  Xk*(i)  7^  0}. 

3.  Compute  Ef  and  x^,  the  restricted  error  matrix  and  coefficient  vector 
respectively,  by  selecting  only  columns  of  E corresponding  to  uik  indices 
and  likewise  for  entries  of  xj,*  (i.e.  discard  zero  entries  in  the  row  vector). 

4.  Apply  SVD:  Ef  =  USVT.  The  updated  dictionary  column  hj,  is  the  first 
column  of  U  and  the  updated  coefficient  vector  xR  =  viS(l,l),  where 
S(l,  1)  is  the  first  and  largest  singular  value  in  the  SVD  of  E^. 


Set  J  =  J  +  1  and  repeat  until  convergence. 


4.4.2  Locality  Preserving  Extension  of  K-SVD 

In  Locality  Preserving  K-SVD  or  LP-KSVD  method  [4],  the  goal  is  to  preserve  the  local  geometry 
of  a  non-linear  manifold  while  determining  the  optimal  linear  support  Hm.  For  our  problem,  it  is 
assumed  that  a  given  AC  data  point  yq  resides  on  a  non-linear  but  smooth  manifold  describing  all 
the  AC  data  points.  This  method  is  an  unsupervised  method  which  attempts  to  learn  discriminative 
atoms  hfc  by  forcing  their  updates  to  be  made  solely  from  samples  in  their  local  neighborhood 
on  the  manifold.  There  are  two  main  benefits  to  the  LP-KSVD  approach  in  comparison  with 
the  K-SVD  [3].  First,  the  employed  local  coding  has  closed- form  solution  which  makes  it  more 
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computationally  efficient  compared  to  sparsity  driven  algorithms  such  as  the  OMP  used  in  the 
K-SVD  method.  Secondly,  the  dictionary  is  optimized  for  both  its  capability  in  representation 
and  its  locality  preservation  property,  which  is  in  contrast  to  sparse  coding  which  only  solves  for  a 
representational  dictionary  [4].  In  this  section,  we  will  give  a  brief  outline  of  the  theory  motivating 
the  LP-KSVD  method. 

To  begin,  there  are  two  main  objectives  of  LP-KSVD.  First,  is  establishing  a  compact  dictionary 
H m  =  [hi,  h2, ...,  h/f]  £  M.NxK  such  that  linear  combinations  of  hj’s  can  approximate  the  nonlinear 
manifold  JA  c  Mw.  Since  we  have  no  access  to  the  true  A4,  Hm  is  estimated  based  on  Ym  (here 
Y.m  is  the  training  samples  for  the  mth  class).  The  second  objective  is  learning  hj’s  as  landmark 
points,  which  are  capable  of  preserving  the  locality  on  JA.  This  dictionary  learning  problem  can 
be  formulated  as: 


min  1 1  Ym  HmXm||^  (18) 

ti-m 

\  Xij  =  0  if  hj  ^  ttT(yj)  Vi,  j 
\l/xj  =  l  Vj 

where  the  reconstruction  error  term  measures  the  fitness  of  H,„  to  Ym  similar  to  K-SVD  in  (17);  the 
matrix  Xm  £  x®  contains  Q  local  reconstruction  codes,  with  Xj  being  the  code  for  reconstructing 
y j  in  terms  of  H m,  and  DT(y  •)  denotes  the  neighborhood  containing  r  nearest  dictionary  atoms  of 
y  ■  in  terms  of  Euclidean  distance  (i.e.  i 2  norm).  The  first  constraint  dictates  that  every  training 
sample  y  •  can  only  be  re-constructed  by  its  r  nearest-neighbor  dictionary  atoms.  The  second 
constraint  allows  the  reconstruction  coefficients  to  be  invariant  to  translation  of  the  data.  The 
main  goal  is  to  choose  hj  to  be  sufficiently  close  to  JA.  Moreover,  to  learn  hj’s  as  landmark  points, 
it  is  further  required  that  each  hj  be  locally  representative  with  respect  to  a  small  patch  on  the 
manifold  JA. 

The  proposed  LP-KSVD  solves  (18)  iteratively  by  alternating  between  the  two  variables  Hm 
and  Xm.  This  is  done  by  first  fixing  Hm  and  solve  for  the  best  co-efficient  matrix  Xm  and  then,  we 
update  H.m  as  well  as  Xm  together.  Iterations  are  terminated  if  either  the  objective  function  value 
is  minimized  below  some  preset  threshold  or  a  maximum  number  of  iterations  has  been  reached. 
The  steps  for  solving  the  local  reconstruction  codes  and  local  dictionary  optimization  are  described 
in  details  in  [4]. 

5  Results  and  Discussion 

5.1  Broadband  Coherence  Detector  Results 

In  this  section,  we  provide  results  of  the  broadband  coherence  statistic  given  in  (4)  when  applied 
to  the  PondExlO  dataset  collected  at  the  NSWC  -  Panama  City,  FL.  Moreover,  the  results  of  this 
detector  are  compared  and  benchmarked  to  the  Hadamard  ratio  given  in  (3)  which  does  not  rely 
on  the  wide  sense  stationary  assumption  when  constructing  the  likelihood  ratio.  For  a  review  of 
the  dataset  used  in  this  section,  the  reader  is  referred  to  [20]. 

5.1.1  Formation  of  the  Test  Statistic 

To  form  detection  decisions,  each  sonar  return  at  a  particular  ping  within  the  run  was  first  matched- 
filtered  by  correlating  the  received  waveform  with  the  LFM  transmit  signal.  Each  matched  filtered 
signal  was  then  partitioned  into  non-overlapping  windows  of  length  N  =  281  (cross-track)  corre¬ 
sponding  to  a  range  resolution  of  approximately  0.25m.  For  every  0.25m  in  the  direction  of  the 
rail  system  (along-track) ,  the  time  series  collected  over  a  P  =  100  ping  window  were  appropriately 
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Rail  system  (b)  Effects  of  Delay  on  the  Matched  Filter  Response 


(a)  Geometry  of  Data  Collection 

Figure  6:  The  matched  filtered  signals  associated  with  each  ping  are  delayed  to  account  for  increases 
in  path  length  as  the  array  moves  along  the  rail. 
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Figure  7:  Processing  of  the  data  to  form  the  likelihood  ratio  involves  filtering  the  time  series 
collected  by  each  sensor  element  into  two  frequency  bands,  using  all  P  pings  to  estimate  the  cross- 
spectral  matrix  S(e3° ,  eJ<3),  and  finally  computing  A  by  integrating  coherence  over  the  Nyquist 
band. 


lagged  to  account  for  the  delay  corresponding  to  the  increase  in  path  length  as  the  array  moves 
along  the  rail  system.  Figure  6  (a)  gives  the  geometry  of  this  procedure  where,  if  the  rrfi1  ping 
exhibits  a  range  of  rm  from  the  area  of  the  target  field  that  is  of  interest  and  there  exists  a  ping 
separation  of  5,  then  the  range  to  the  target  area  at  ping  m  +  p  is  simply  rm+p  =  \Jr‘jn  +  (p<5)2- 
This  increase  in  path  length  can  then  be  used  to  determine  the  relative  delay  between  each  ping 
and  one  may  then  account  for  this  accordingly.  Figure  6  (b)  gives  an  example  of  the  matched 
filtered  response  from  a  target  both  before  and  after  accounting  for  relative  time  delays.  After 
accounting  for  time  delay,  Figure  7  gives  a  block  diagram  of  the  subsequent  processing  steps  used 
in  the  formation  of  the  likelihood  ratio  for  each  0.25  X  0.25  m  location  within  the  target  field.  As 

shown  in  this  figure,  the  time  series  for  each  sensor  element  and  for  each  ping  is  first  filtered  into 

th 

L  =  2  non-overlapping  frequency  bands  using  two  8L  -order  Butterworth  filters,  one  a  lowpass  filter 
to  remove  high  frequency  content  and  the  other  a  highpass  filter  to  likewise  remove  low  frequency 
content.  This  filtering  is  done  independently  for  all  M  =  5  sensor  elements  in  the  array  and  the 
resulting  time  series  are  vectorized  to  form  the  vector  x,;[p]  for  the  ping.  This  is  done  for  all 
P  =  100  pings  and  the  data  from  both  frequency  bands  is  concatenated  to  form  the  data  matrix  Z 
given  in  (2).  A  2-dimensional  Fast  Fourier  Transform  (FFT)  is  then  applied  to  the  data  from  both 
frequency  bands  and  the  estimated  cross-spectral  matrix  S(e3e,e3<^)  given  in  (6)  is  computed.  The 
likelihood  ratio  given  in  (4)  is  finally  formed  by  numerically  integrating  the  frequency/wavenumber 
Hadamard  ratio  using  a  trapezoidal  rule  approximation. 
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Figure  8:  Receiver  Operating  Characteristic  (ROC)  curve  for  the  PondExlO  dataset. 
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Figure  9:  Beamformed  SAS  and  likelihood  ratio  images  for  both  detection  methods  at  0°  object 
orientation. 


5.1.2  Detection  Results 

The  broadband  coherence  detector  given  in  (4)  was  applied  to  all  10  runs  of  the  PondExlO  dataset 
and  compared  to  the  Hadamard  ratio  given  in  (3).  Using  all  50  objects  in  the  dataset  (10  runs  x  5 
objects  per  run)  as  well  as  a  randomly  selected  set  of  locations  corresponding  to  background,  Figure 
8  displays  the  ROC  curve  for  both  detection  methods.  This  curve  is  formed  by  varying  the  threshold 
of  the  detector  over  a  range  of  values  and,  at  each  threshold,  comparing  the  percentage  of  targets 
that  are  detected  to  the  average  number  of  background  realizations  per  image  that  are  mis-detected. 
From  this  figure  it  is  clear  that,  by  taking  advantage  of  the  WSS  assumption,  the  broadband 
coherence  detector  outperforms  the  Hadamard  ratio.  One  can  also  see  that  the  broadband  coherence 
detector  performs  well  at  discriminating  UXO  from  background  as  the  proposed  technique  detects 
all  of  the  objects  after  only  about  an  average  of  one  false  alarm  per  image.  The  broadband  coherence 
detector  also  exhibits  a  knee-point  (the  point  where  Pd  +  Pfa  =  1  which  is  shown  by  a  small  circle  in 
Figure  8)  at  a  probability  of  detection  of  Pd  =  99%.  Compared  to  the  results  of  the  detector  from 
the  previous  report  [20],  one  can  see  that  this  new  version  of  the  broadband  coherence  detector 
exhibits  a  3%  improvement  in  Pd  at  the  knee-point  of  the  ROC  curve  over  that  of  its  previous 
version. 

Figures  9  (a)  and  10  (a)  show  the  beamformed  SAS  images  generated  using  k-omega  beamform¬ 
ing  for  a  run  with  0°  object  orientation  and  one  with  80°  orientation,  respectively.  Note  that,  at 
0°  orientation  the  objects  are  broadside  while  a  90°  orientation  corresponds  to  situation  where  the 
nose  of  the  objects  are  pointed  toward  the  array.  Figures  9  (b)  and  10  (b)  likewise  show  images 
of  the  broadband  coherence  statistic  given  in  (4)  computed  at  every  0.25  x  0.25  m  location  in  the 
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Figure  10:  Beamformed  SAS  and  likelihood  ratio  images  for  both  detection  methods  at  80°  object 
orientation. 


target  field  for  this  two  runs.  Likewise,  Figures  9  (c)  and  10  (c)  show  images  of  the  likelihood  ratio 
for  the  Hadamard  ratio  detector  given  in  (3).  From  these  images  one  can  see  that  both  methods  are 
able  to  detect  the  objects  with  relative  ease  when  at  an  orientation  of  0°  but  have  more  difficulty 
when  the  objects  are  oriented  at  80°.  However,  by  comparing  the  likelihood  ratios  of  the  broadband 
coherence  and  Hadmard  ratio  detectors,  one  can  once  again  visually  observe  that  the  broadband 
coherence  detector  does  a  better  job  of  discriminating  target  from  background. 

5.2  MSC  Classifier  Results 

In  this  subsection,  we  provide  results  of  the  MSC  when  tested  on  the  PondEX09,  PondEXlO 
and  TREX13  datasets.  First,  brief  reviews  of  the  training  and  testing  datasets  will  be  provided. 
Following  these  discussions,  we  present  the  performance  of  the  MSC  when  tested  on  PondEX09  and 
PondEXlO  datasets  using  MSC  with  signal  subspace  components  extracted  via  the  regular  SVD, 
K-SVD,  and  LP-KSVD  methods.  This  is  repeated  for  the  TREX13  objects  in  two  different  range 
groupings. 

5.2.1  Description  of  Training  and  Testing  Datasets 

5.2. 1.1  Fast-Ray  Model  Generated  Dataset  —  Training  The  Fast-Ray  model  can  be  uti¬ 
lized  to  create  AC  data  for  a  variety  of  environments  and  simulated  runs.  Using  the  model  and 
procedures  described  in  Sections  4.2  and  4.3,  raw  sonar  runs  for  10  different  objects  used  in  the 
PondEX  and  TREX  experiments  were  modeled.  The  modeled  runs  were  designed  to  replicate  the 
condition  of  the  PondEX  and  TREX  data.  In  particular,  these  runs  were  generated  for  a  21  nr 
path  length  at  horizontal  ranges  of  10  nr  and  30  nr,  with  the  source/receiver  interface  elevated  3.8 
m  above  the  sediment.  These  synthetic  sonar  datasets  and  their  corresponding  AC  data  were  gen¬ 
erated  for  two  different  environments.  Both  modeled  environments  simulated  water  sound  speeds 
matching  those  that  were  used  in  the  generation  of  the  scattering  coefficients  in  (8)  which  were 
generated  via  FE  method.  Similar  to  the  testing  data,  object  rotations  from  —80°  to  +80°  in  20° 
increments,  were  generated  to  give  AC  data  for  360°  of  aspect  for  each  object  and  environment. 
This  was  then  repeated  for  the  two  ranges  we  observed  in  TREX13  data  collection,  10m  and  30m. 
It  must  be  mentioned  that  dictionaries  were  trained  for  each  of  the  test  sets  separately,  with  FRM 
training  data  that  matched  the  appropriate  range  and  included  only  the  objects  viewed  in  that 
experiment. 

For  each  object,  the  AC  data  generated  was  decimated  along  the  frequency  dimension  to  have 
N  =  310  frequency  bins  spanning  the  0  —  31  kHz  frequency  band  corresponding  to  approximately 
100  Hz  separation  of  frequency  bins.  After  generating  AC’s  at  stops  that  one  might  encounter  in 
a  21  m  run,  each  training  set  was  uniformly  sub-sampled  to  a  set  containing  l/7th  of  all  possible 
aspects  of  our  synthetic  training  data.  Using  these  synthetic  sonar  datasets,  signal  subspaces  were 
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Table  2:  Fast  Ray  Model  training  data  -  experiment  objects  for  which  FRM  exists 


No. 

Class 

FRM  Name 

Object  Description 

1 

non-UXO 

alcyl2ft 

2  ft  Aluminum  Cylinder 

2 

non-UXO 

alcyl3ft 

3  ft  Aluminum  Cylinder 

3 

non-UXO 

alpipe 

2  ft  Aluminum  Pipe  Section 

4 

UXO 

aluxo 

100  mm  Aluminum  Rocket  Round 

5 

UXO 

ssuxo 

100  mm  Solid  Steel  Rocket  Round 

6 

UXO 

bullet  105mm  air 

105  mm  Bullet  (Air  Filled) 

7 

UXO 

bullet  _105mm_h2o 

105  mm  Bullet  (H2O  Filled) 

8 

UXO 

howitzer  _cap_air 

155  mm  Howitzer  with  Cap  (Air  filled) 

9 

UXO 

howitzer  _cap_h2o 

155  mm  Howitzer  with  Cap  (H2O  filled) 

10 

UXO 

howitzer  _nocap 

155  mm  Howitzer  no  Cap 

trained  using  the  methods  described  in  Sections  4.4.1  and  4.4.2.  For  the  trained  dictionaries  to  be 
used  in  Pond  testing,  the  training  data  was  truncated  to  the  frequency  bins  of  the  incident  LFM 
signal  power,  spanning  1  —  30.5  kHz,  hence  leading  to  AC  vectors  with  Nponci  =  295  dimensions. 
Similarly,  for  the  TREX  dictionaries,  since  the  incident  LFM  was  in  the  3  —  30.5  kHz  frequency 
range,  the  resulting  truncated  AC  vectors  were  Ntrex  =  275  dimensional.  Dictionaries  of  sizes 
H  pon(i  £  R295xA  5  Hr  rex  £  r27’5xK'  (for  £gm  anc}  3()rn  ranges)  were  generated  using  sparsity  factor 
r  =  10  for  both  LP-KSVD  and  KSVD  methods.  Here  I\  =  400  for  non-UXO  class  and  K  =  600 
for  the  UXO  class.  As  was  mentioned  before,  and  in  contrast  to  the  SVD  method,  in  K-SVD  and 
LP-KSVD  testing,  the  sparsity  of  reconstruction  codes  was  enforced  by  applying  OMP  algorithm. 
The  number  of  iterations  used  in  training  was  chosen  to  be  30  as  this  was  found  to  be  acceptably 
close  to  minimum  achievable  reconstruction  error.  Table  2  enumerates  and  briefly  describes  each  of 
the  training  data  objects  used.  For  the  SVD  subspace  training,  the  top  30  non-UXO  eigenvectors, 
associated  with  the  30  largest  eigenvalues,  of  the  non-UXO  training  set  were  kept  whereas  for  the 
UXO  training  set,  the  top  50  eigenvectors  were  kept.  Objects  1,  3,  4,  and  5  were  only  observed  in 
the  PondEX  experiments  while  all  10  objects  were  present  in  the  TREX13  test  set. 

5. 2. 1.2  PondEX  Datasets  —  Testing  Figure  11  shows  the  layout  of  the  PondEX09  and  Pon- 
dEXlO  experiments  including  the  relative  locations  of  the  rail-mounted  sonar  system  and  the  place¬ 
ment  of  objects  in  the  target  field.  The  21m  rail  the  sonar  system  is  mounted  on  is  fixed  to  eliminate 
platform  motion  as  the  sonar  interface  moves  along  its  track.  The  sonar  transmit  signal  is  a  6  msec 
LFM  over  0  —  31  kHz  with  a  10%  taper  between  the  leading  and  trailing  edges  to  minimize  ringing 
in  the  transmitted  signals.  Sonar  backscatter  is  received  with  L  =  6  hydrophone  elements  that  are 
arranged  in  a  linear  array  normal  to  the  seafloor. 

As  can  be  seen  from  Figure  11,  the  target  field  in  PondEXlO  contained  five  objects  at  a  time, 
all  of  which  were  located  approximately  10  m  from  the  rail  system  and  are  proud  on  the  sandy 
bottom.  For  each  run  that  was  made  in  the  PondEX  data  collection,  the  orientation  of  the  axes 
of  symmetry  for  all  the  observed  objects  were  the  same  relative  to  the  fixed  sonar  rail.  Nine  total 
object  orientations  were  used,  ranging  from  —80°  to  +80°  in  20°  increments,  where  the  run  with  0° 
object  orientation  designates  a  configuration  where  an  object’s  major  axes  of  symmetry  is  parallel 
to  the  rail  system.  Each  run  of  data  consists  of  769  pings  in  which  the  sonar  platform  moved  along 
the  fixed  rail  in  increments  of  0.025  m,  transmitting  and  receiving  once  for  each  fixed  position.  The 
data  was  sampled  at  1  MHz  and  the  sonar  platform  was  tilted  at  a  fixed  20°  grazing  angle  for  all 
runs  (angle  of  the  sonar  main  response  axis  with  respect  to  the  horizontal  plane). 

Since  the  useful  spectral  information  in  the  collected  data  has  a  Nyquist  frequency  well  below 
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Figure  11:  Layout  of  the  target  fields  for  PondEx09  and  EX10  data  collection  [21] 
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Figure  12:  Inverse  SAS  filtering  [14] 


the  sampled  rate,  the  1  MHz  data  was  down-sampled  by  a  factor  of  10  resulting  in  effective  sampling 
frequency  of  100  kHz.  Owing  to  the  fact  that  the  LFM  incident  signal  for  PondEX  experiments 
has  a  range  of  frequency  1  —  30.5  kHz,  only  those  frequency  bins  were  retained  resulting  in  AC 
feature  vector  dimension  of  Npon(i  =  295.  In  the  PondEX  experiments  since  the  objects  are  laid 
out  very  close  together,  to  generate  AC  data  for  each  individual  object  free  of  the  influence  of 
the  neighboring  ones,  a  procedure  using  the  Inverse-SAS  filtering  process  [14]  is  utilized.  This 
procedure  is  depicted  in  Figure  12  which  shows  this  inverse-SAS  filtering  to  remove  neighboring 
object  returns  from  the  pulse-compressed  data. 

As  this  classifier  is  applied  to  the  output  of  the  previously  developed  broadband  coherence-based 
detector  [22],  the  detected  regions  of  interest  (ROI)  that  contained  objects  of  interest  were  tested 
by  the  classifier.  A  hard-limiting  power  threshold  was  implemented  in  gathering  test  aspects  from 
detected  regions  of  interest.  This  threshold  selects  observations  from  the  filtered  runs  by  summing 
the  squared  frequency  bin  components  in  a  single  AC  aspect  and  determining  the  total  power  of 
the  observation.  Those  aspects  that  meet  the  hard-limiting  threshold  are  kept.  For  too  aggressive 
threshold,  the  top  5%  of  aspects  will  be  chosen.  That  is,  at  least  5%  of  aspects  in  detected  ROI’s 
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Table  3:  Objects  in  the  Pond  EX09-10  experiment  testing  data 


FRM  Name 

#  Runs/ Aspects 

alcyl2ft 

10/3282 

alpipe 

10/1580 

aluxo 

10/315 

ssuxo 

10/717 

Total  #  Tested: 

5894 

are  picked  for  testing.  One  might  note  the  difference  in  prevalence  of  class  samples.  This  is  also 
due  to  this  power  threshold.  It  was  found  that  for  the  Aluminum  Cylinder  and  Aluminum  Pipe, 
many  views  of  the  object  would  return  soundings  above  the  set  threshold  that  isolated  the  best 
UXO  type  soundings. 

5.2. 1.3  TREX13  Dataset  Testing  The  same  rail  system  described  in  the  PondEx  experi¬ 
ments  was  utilized  in  the  TREX13  data  collection  process.  However,  the  environmental  conditions 
were  vastly  different  as  the  TREX13  dataset  was  collected  in  the  bay  area  off  of  Panama  City 
coast.  The  LFM  incident  signal  used  in  the  TREX13  dataset  had  power  in  the  frequency  range 
of  3  —  30.5  kHz.  Thus,  using  the  same  frequency  sampling,  the  AC  feature  vector  dimension  for 
TREX13  becomes  Ntrex  =  275,  i.e.  truncating  the  original  N  =  310  frequency  bins  spanning 
0  —  31  kHz,  to  those  that  span  3  —  30.5  kHz.  Additionally,  the  same  Inverse-SAS  filtering,  AC 
generation,  and  hard-limiting  threshold  processes  described  in  Section  5. 2. 1.2  were  implemented  to 
obtain  filtered  data  AC  portions  for  each  object  in  the  TREX  test  set.  Table  4  lists  those  objects 
and  ranges  for  which  Inverse-SAS  filtered  runs  exist.  Ranges  10m  and  30m  were  used  in  our  testing 
to  further  validate  the  performance  of  the  classifier  on  data  from  a  more  realistic  environment  when 
compared  to  PondEX  dataset. 

Table  4:  TREX13  available  target  ranges 


No. 

TREX  # 

FRM  Name 

5m 

10m 

15m 

20m 

25m 

30m 

35m 

40m 

1 

Target  17 

alcyl2ft 

X 

2 

Target  7 

alcyl3ft 

X 

X 

X 

3 

Target  16 

alpipe 

X 

X 

X 

4 

Target  20 

aluxo 

X 

X 

X 

X 

5 

Target  21 

ssuxo 

X 

X 

X 

X 

6 

Target  25 

bullet_105mm_air 

X 

X 

X 

X 

7 

Target  29 

bullet_105mm_h2o 

X 

X 

X 

8 

Target  9 

howitzer  _cap_air 

X 

X 

X 

X 

9 

Target  28 

howitzer  _cap_h2o 

X 

X 

X 

10 

Target  8 

howitzer  _nocap 

X 

X 

X 

X 

5.2.2  Classification  Results  and  Analysis 

5. 2. 2.1  PondEX  Testing  Results  To  make  a  classification  decision  for  a  given  sonar  ping 
observation,  the  corresponding  AC  data  vector  of  dimension  Npon(i  =  295  that  contains  the  spec¬ 
tral  features  of  an  underwater  UXO  or  non-UXO  object  at  a  particular  aspect  is  applied  to  the 
classifier.  For  the  MSC,  this  decision-making  is  implemented  using  the  classification  rule  given  in 
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(16).  The  same  process  can  be  carried  out  for  all  object  aspects  in  a  run  in  the  PondEX09  and 
PondEXlO  testing  dataset.  However,  using  the  more  general  decision  rule  (15)  for  MSC,  one  can 
make  decision  on  an  augmented  data  matrix  Y  that  contains  multiple  AC  observation  vectors  at 
different  aspects.  This  multi-aspect  decision-making  provides  better  opportunity  to  discriminate 
between  UXO  and  non-UXO  object  classes.  Moreover,  multi-aspect  classification  is  more  amenable 
to  actual  operational  situations  where  several  views  from  an  underwater  objects  are  received.  Here, 
we  use  three  aspects  of  an  object  to  perform  classification  decision.  Although,  in  the  PondEX  ex¬ 
periments  aspect  separation  is  uniform  due  to  the  rail  system,  to  account  for  platform  instability  in 
a  realistic  data  collection  scenario  aspect  separation  is  modeled  by  a  uniformly  distributed  random 
variable  s  ~  unif  {8, 16}  which  shuffles  data,  beginning  with  the  first  aspect  in  a  given  run  that 
meets  the  power  threshold.  Note  that  the  procedure  does  not  shuffle  between  rotations  of  objects, 
only  the  order  in  which  the  aspects  of  a  single  linear  run  are  encountered  is  changed. 

ROC  for  Various  TestsAsp/obs  =  3  roc  for  Various  TestsAsp/obs  =  3 


(a)  ROCs  for  three  methods  (3  Asp/dec.)  (b)  ROCs  for  LP-KSVD  on  Pond  data  (3  Asp. /dec.) 

Figure  13:  ROCs  for  Pond  data 

Figures  13a  and  13b  give  the  Receiver  Operating  Characteristic  (ROC)  curves  of  the  classi¬ 
fier  for  a  three-aspect  decision  and  using  50  Monte  Carlo  trials  (one  for  each  random  three-aspect 
combination  with  a  fixed  initial  aspect)  for  SVD,  K-SVD,  and  LP-KSVD  dictionaries.  More  specif¬ 
ically,  Figure  13a  gives  the  average  performance  over  50  Monte  Carlo  trials  in  which  each  filtered 
run  (single  rotation)  dataset  is  grouped  into  3  aspects  with  a  uniformly  distributed  separation,  as 
mentioned  before.  As  can  be  observed  by  the  3  blue  circles  in  Figure  13a,  the  ROC  curves  for  the 
SVD,  K-SVD,  and  LP-KSVD  exhibit  knee-point  (the  point  where  Pcc  +  Pfa  =  1)  probability  of 
correct  classification  of  Pcc, SVD  ~  85.41%,  Pcc,ksvd  ~  90.07%,  Pcc, lp-ksvd  ~  93.20%  proba¬ 
bility  of  false  alarm  Pfa, SVD  ~  14.59%,  Pfa,ksvd  ~  9.93%,  Pfa, lp-ksvd  ~  6.80%,  respectively. 
These  results  show  that  the  MSC  performs  very  well  in  discriminating  ‘UXO’  vs.  ‘non-UXO’  in 
the  PondEX  dataset  in  spite  of  the  fact  that  there  are  indeed  obvious  discrepancies  between  the 
model  data  used  for  training  and  the  AC  sonar  data  of  actual  objects.  Additionally,  the  best 
overall  results  for  the  PondEX  datasets  are  obtained  for  a  classifier  trained  and  tested  using  data 
from  LP-KSVD  dictionary  learning  method.  Figure  13b,  on  the  other  hand,  shows  the  best  and 
worst  performance  over  50  trials  for  this  method.  This  ROC  plot  indicates  that  the  classification 
performance  is  indeed  statistically  consistent  over  50  trials. 

Table  5  displays  the  confusion  matrix  for  the  MSC.  It  was  found  that  the  most  common  type 

I  error  (i.e.  ‘UXO’  mis-classification)  occurred  for  the  Stainless  Steel  UXO  object  (Table  3,  object 
4),  which  was  most  commonly  mis-classified  as  2  ft  Aluminum  Cylinder.  All  the  observed  type 

II  errors  (i.e.  False  Alarms)  occurred  for  samples  of  Aluminum  Pipe  object,  which  were  almost 
equally  mis-classified  as  the  Aluminum  and  Stainless  Steel  UXO  classes.  These  results  suggest 
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that  the  inclusion  of  specific  misleading  aspects  in  the  training  of  subspaces  could  potentially  have 
a  significant  influence  on  the  power  of  FRM  data  in  representing  different  classes.  Additionally, 
the  results  in  Figure  13a  and  Table  5  reveal  much  improved  performance  of  our  approach  when 
compared  to  the  results  of  previously  tested  classifiers  [14],  including:  kernel  matching  pursuit, 
support  vector  machine,  and  relevance  vector  machine,  which  at  best  demonstrated  target  vs.  non¬ 
target  discrimination  with  Pec  ~  90%  and  Pfa  ~  10%.  However,  in  these  previous  cases  the 
classifiers  were  trained  on  real  sonar  data  whereas  here  training  was  exclusively  done  on  model¬ 
generated  datasets  via  the  fast  ray  model  [13] . 

Table  5:  Confusion  matrix  of  classified  signals  (LP-KSVD) 


Truth-Decision 

‘non-UXO’ 

‘UXO’ 

alcyl2ft 

1.0000 

0 

alpipe 

0.9681 

0.0319 

aluxo 

0.0297 

0.9703 

ssuxo 

0.2103 

0.7897 

5. 2. 2. 2  TREX13  Fixed  Range  Testing  Results  The  same  procedure  described  for  the  Pon- 
dEX  datasets  was  repeated  for  TREX13  datasets  for  the  10m  and  30m  runs  and  for  all  testing 
datasets  for  which  FRM  training  data  exist  (See  Table  4).  Note  that  for  the  30m  runs,  the  classifier 
had  to  be  retrained  to  include  the  corresponding  model-generated  data.  Additionally,  as  mentioned 
before  for  the  TREX13  datasets  the  AC  feature  vector  is  N  =  275  dimensional.  Again,  the  aspects 
of  a  single  rotation  run,  which  met  the  power  threshold,  were  ordered  starting  with  the  first  high 
powered  aspect  and  making  aspect  separation  s  ~  unif  {8, 16}. 

Figure  14b  shows  the  ROC  curves  for  a  three-aspect  MSC  on  the  10m  run  datasets  using  50 
Monte  Carlo  trials  for  the  SVD,  K-SVD,  and  LP-KSVD  dictionaries.  The  ROC  knee-point  perfor¬ 
mance  for  these  methods  are  at  probability  of  correct  classification  of  Pec, SVD  ~  95.1%,  Pcc,ksvd  ~ 
90.1%,  Pec, lp-ksvd  ~  91.3%,  and  probability  of  false  alarm  Pfa, SVD  ~  4.9%,  Pfa,ksvd  ~ 
9.9%,  Pfa, lp-ksvd  ~  8.7%,  respectively.  Here,  the  best  overall  results  for  the  TREX13  datasets 
were  obtained  for  a  classifier  trained  and  tested  using  data  from  the  SVD  dictionary  learning 
method. 

Figure  14a  shows  the  ROC  curves  for  a  three-aspect  MSC  classifier  on  the  30m  run  datasets  using 
the  SVD,  K-SVD,  and  LP-KSVD  dictionaries.  The  ROC  knee-point  performance  for  these  methods 
are  at  probability  of  correct  classification  Pec, SVD  ~  83 %,Pcc,ksvd  ~  86%,  Pec, lp-ksvd  ~ 

88%  and  probability  of  false  alarm  Pfa, SVD  ~  17%,  Pfa,ksvd  ~  14%,  Pfa, lp-ksvd  ~  12%, 
respectively.  Again,  these  results  attest  to  the  fact  that  our  three-aspect  MSC  classifier  exclusively 
trained  on  model  data  performs  very  well  in  discriminating  ‘UXO’  vs.  ‘non-UXO’  in  the  TREX13 
dataset.  Also,  as  in  the  PondEX  datasets  the  best  overall  results  for  the  TREX13  datasets  are 
obtained  for  a  classifier  trained  and  tested  using  data  from  LP-KSVD  dictionary  learning  method. 
Table  6  displays  the  confusion  matrix  for  the  MSC  at  the  decision  threshold  which  corresponds  to 
choosing  the  minimum  statistic  between  UXO  and  non-UXO  classes  in  (16). 

6  Conclusions  and  Implications  for  Future  Research/Implementation 

6.1  Conclusions  and  Discussions 

The  objectives  addressed  in  the  current  study  revolved  around  the  development  and  testing  of  UXO 
detection  and  classification  algorithms  from  sonar  data.  A  multichannel  detection  strategy  relying 
on  the  theory  of  the  GLRT  was  developed  using  the  broadband  coherence  method  in  [1],  For  this 
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ROC  for  Various  TestsAsp/obs  =  3  ROC  for  Various  TestsAsp/obs  =  3 


(a)  ROCs  for  10m  TREX13  dataset  (3  Asp/dec.)  (b)  ROCs  for  30m  TREX13  dataset  (3  Asp./dec.) 

Figure  14:  ROCs  for  TREX13  dataset 
Table  6:  Confusion  matrix  of  classified  signals  (LP-KSVD)  at  30m 


Truth-Decision 

‘non-UXO’ 

‘UXO’ 

alcyl3ft 

0.8116 

0.1884 

alpipe 

0.9339 

0.0661 

aluxo 

0.1582 

0.8418 

ssuxo 

0.5556 

0.4444 

how cap h2o 

0.0332 

0.9668 

how.nocap 

0.0983 

0.9017 

problem,  our  detection  hypothesis  was  that  the  presence  of  munitions  in  the  sonar  backscatter 
collected  from  a  hydrophone  array  over  two  distinct  frequency  bands  will  lead  to  higher  levels  of 
coherence  compared  to  the  backscatter  from  the  seafloor  alone.  This  increase  in  coherence  can 
give  one  an  indication  of  which  areas  of  the  target  field  may  contain  potential  munitions  that 
will  further  be  analyzed  by  a  UXO  vs.  non-UXO  classifier.  Relying  on  the  use  of  the  GLRT 
and  making  an  assumption  that  the  received  time  series  are  wide  sense  stationary,  the  amount 
of  coherent  information  among  the  channels  is  measured  using  the  broadband  coherence  statistic. 
This  multichannel  detector  was  then  applied  to  a  dataset  collected  in  a  pond  facility  designed  to 
collect  acoustical  sonar  data  from  underwater  objects  in  a  relatively  controlled  and  clutter-free 
environment.  Results  of  this  preliminary  experiment  show  that  the  broadband  coherence  statistic 
is  indeed  capable  of  detecting  munitions  lying  on  the  seafloor  from  background  as  the  ROC  curve 
of  the  detector  exhibited  a  100%  probability  of  detection  with  an  average  of  1  false  alarm  per 
image.  This  suggests  that  the  presence  of  munitions  in  the  sonar  backscatter  does  indeed  lead  to  a 
higher  level  of  coherence  and  that  the  proposed  methodology  would  be  capable  of  finding  regions 
containing  munitions  in  the  target  field.  This  detector  could  not  be  tested  on  the  raw  TREX13 
datasets  due  to  data  distribution  limitation. 

The  second  objective  considered  in  this  seed  research  was  the  use  of  scattering  models  from 
various  UXO  and  non-UXO  objects  for  training  purposes.  The  scattering  model  that  has  been  de¬ 
veloped  by  the  APL-UW  subcontractors  allowed  for  monostatic  SAS  data  sets  to  be  simulated  via 
a  fast  ray  model  that  combines  an  acoustic  ray  approximation  for  propagation  in  a  fluid-filled  halfs¬ 
pace  with  scattering  from  a  target  (UXO  or  non-UXO)  in  free-space.  This  fast  modeling  provides  a 
large  database  for  feature  extraction  and  classifier  training  during  the  classifier  development.  The 
hypothesis  tested  here  is  whether  or  not  the  classifier  trained  exclusively  on  model-generated  sonar 
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data  of  various  UXO-like  objects  provides  enough  generalization  ability  to  discriminate  munitions 
with  sufficient  accuracy  in  real  sonar  datasets.  This  is  critical  as  real  sonar  data  for  a  wide  range 
of  UXOs  and  in  various  operational  and  environmental  conditions  is  currently  unavailable.  Thus, 
the  development  of  classification  systems  that  are  originally  trained  on  model-based  data  would 
provide  us  with  clues  on  what  additional  information  extracted  during  the  actual  field  operation 
can  be  used  to  augment  the  training  for  optimal  performance. 

Finally,  the  third  task  in  this  study  was  the  development  of  a  MSC-based  UXO  vs.  non- 
UXO  classifier  using  several  dictionary  learning  methods.  The  model- generated  data  via  the  fast 
ray  model  provides  an  over-complete  dictionary  for  creation  of  signal  subspaces  used  in  the  MSC 
classifier.  In  this  work,  we  used  three  different  methods  namely  the  regular  SVD,  K-SVD  [3]  and 
LP-KSVD  methods  [4],  AC  data  was  used  in  conjunction  with  these  methods  to  produce  the 
training  and  testing  datasets  for  the  classifier.  Once  MSC  was  properly  trained  on  the  model¬ 
generated  data,  it  was  tested  on  real  PondEX09,  PondEXlO,  and  TREX13  datasets.  Our  binary 
classification  results  for  UXO  vs.  non-UXO  reveal  significant  performance  improvements  when 
compared  to  the  existing  results  [11],  [14],  What  is  particularly  notable  about  these  results  is  that 
the  utility  of  the  fast  ray  model  in  representing  real  sonar  data  for  classification  purposes  has  been 
confirmed.  Among  the  three  dictionary  learning  methods,  LP-KSVD  was  found  to  consistently 
provide  better  results  on  both  datasets.  For  the  PondEX  experiments,  the  knee  point  performance 
for  LP-KSVD  exhibits  Pec  ~  93.2%,  Pfa  ~  6.8%;  while  for  the  TREX13  experiment  at  the  30m 
range,  the  knee  point  performance  for  LP-KSVD  was  at  Pec  ~  87.9%,  Pfa  ~  12.79%.  Comparing 
the  two  datasets,  the  degradation  in  the  classification  performance  for  the  TREX13  datasets  is 
mostly  attributed  to  more  realistic  and  challenging  bottom  conditions  and  larger  number  of  objects 
with  similar  characteristics. 

6.2  Possible  Future  Research/Implementation 

This  seed  research  surfaced  many  critical  problems  that  can  be  pursued  in  future  research  oppor¬ 
tunities.  Below  is  a  list  of  only  a  few  items  that  we  believe  are  worthy  of  further  research. 

•  Task  1:  Broadband  Coherence  Detector  with  Adaptive  Thresholding: 

Although  the  broadband  coherence  detector  is  fully  capable  of  discriminating  time  series 
corresponding  to  UXO  objects  from  those  corresponding  to  seafloor  background,  one  of  the 
most  difficult  parts  of  implementing  it  in  any  realistic  scenario  will  be  determining  a  detection 
threshold  that  is  robustly  capable  of  achieving  a  desired  false  alarm  rate  over  the  wide  range  of 
environments  that  may  be  encountered.  To  achieve  this,  we  propose  to  adapt  this  parameter  of 
the  algorithm  as  data  is  collected  within  the  environment  by  fitting  a  parametric  distribution 
to  the  broadband  coherence  statistics  produced  from  that  data.  The  parameters  of  this 
distribution  may  then  be  recursively  updated  as  new  data  is  collected  to  give  the  detector 
the  ability  to  adapt  to  changing  statistical  behavior  in  the  likelihood  ratio.  Note  that  this 
unsupervised  approach  relies  on  an  assumption  that  it  is  very  unlikely  to  observe  a  UXO  object 
so  that  the  statistics  of  the  likelihood  ratio  are  a  good  indication  of  that  for  background  alone. 

•  Task  2:  Incremental  Training  for  Better  Robustness:  Our  results  presented  in  this 
report  clearly  indicate  the  promise  and  effectiveness  of  the  developed  methods  for  detection 
and  classification  of  munitions  from  sonar  data.  However,  it  is  obvious  that  the  spectral 
features  in  the  AC  data  for  a  specific  target  vary  significantly  depending  on  the  object’s 
burial  condition,  seafloor  properties  and  roughness,  actual  orientation  of  the  object,  range 
and  grazing  angles  with  respect  to  the  sonar,  etc.  Although,  it  is  unrealistic  to  expect  model 
data  will  capture  all  such  variations  for  target  characterization,  it  can  provide  us  with  clues 
on  how  to  augment  the  training  datasets  using  perhaps  a  limited  training  samples  from  sonar 
returns  of  actual  objects  to  improve  the  robustness  in  different  environmental  conditions. 
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The  importance  of  this  data  augmentation  beyond  the  Fast  Ray  Model  data  and  incremental 
training  using  real  data  to  improve  classification  performance  is  demonstrated  here.  Several 
PondEX  object  samples  (only  2  %  of  training  from  real  PondEX  test  data)  that  correspond 
to  the  same  objects  in  the  TREX13  dataset  were  added  to  the  model-generated  training 
set  to  form  the  dictionary  training  set  and  build  the  MSC  classifier.  The  trained  system 
was  then  tested  on  TREX  10m  run  datasets.  It  was  found  that  the  addition  of  real  Pond 
samples  from  the  relevant  objects  improved  the  overall  results  noticeably.  The  knee-point 
performances  of  the  SVD,  K-SVD,  and  LP-KSVD  for  this  incremental  training  test  were 
Pcc,SVD  ~  96.70%,  Pcc,ksvd  ~  90.43%,  Pcc, lp-ksvd  ~  94.1%  and  probability  of  false 
alarm  Pfa,svd  ~  3.30%,  Pfa,ksvd  ~  9.57%,  Pfa, lp-ksvd  ~  5.9%  respectively.  Clearly, 
we  didn’t  attempt  to  optimize  this  data  selection  process  based  upon  the  information  content 
of  the  selected  sample  for  classification  purposes.  Additionally,  we  had  to  retrain  (in  batch) 
all  the  dictionaries  and  consequently  the  MSC  classifier,  which  is  not  obviously  an  efficient 
way  of  doing  this  training.  Thus,  we  propose  to  accomplish  this  by  developing  new  methods 
for  optimal  sample  selection  and  incremental  training  that  can  update  the  parameters  of  the 
MSC  classifier  iteratively  in  order  to  learn  the  new  and  informative  samples  taken  in  the  new 
environment  while  guaranteeing  the  stability  of  the  previously  learned  samples. 


Figure  15:  Incremental  Learning  Result,  10m  TREX13  (3  Asp/dec.) 

•  Task  3:  Nonlinear  Extension  of  MSC: 

Figures  16  (a)  and  (b)  shows  the  decision  boundary  and  the  distribution  of  the  MSC’s  test 
statistics  for  UXO  and  Non-UXO  objects  in  the  TREX13  10m  runs  and  LP-KSVD  and  SVD 
dictionaries,  respectively.  As  can  be  seen,  there  are  several  objects  that  are  misclassified  for 
both  object  types.  One  of  the  reasons  for  this  imperfect  performance  has  to  do  with  the 
fact  that  the  MSC  is  a  linear  classifier  designed  based  upon  the  subspaces  associated  with 
the  UXO  and  Non-UXO  classes.  It  is  well-known  [23]  that  certain  non-linearly  separable 
classification  problems  can  easily  be  converted  to  linearly  separable  problems  by  mapping  the 
data  to  a  high  dimensional  feature  space  using  kernel-producing  nonlinear  mapping  functions. 
In  this  particular  application,  there  is  no  guarantee  that  under  all  conditions  the  solution  of 
the  optimization  problem  for  the  model  using  the  LP-KSVD  or  SVD  will  lead  to  features 
that  are  sparse  and  confined  to  the  subspace  associated  with  the  class  of  the  object.  In 
other  words,  the  resultant  features  could  “leak”  into  subspaces  associated  with  the  opposite 
classes,  hence  leading  to  possible  misclassihcations  and  false  alarm  as  can  be  seen  in  Figures 
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16.  To  overcome  such  issues,  we  propose  to  investigate  the  kernel-based  extension  of  the 
MSC  classification  framework.  It  is  also  interesting  to  investigate  what  types  of  kernels  can 
provide  the  best  results  for  the  corresponding  structured  learning.  We  propose  to  investigate 
this  issue  thoroughly  in  this  task. 

LP-KSVD  Decision  Plane  SVD  Decision  Plane 


(a)  LP-KSVD  dictionary. 


(b)  SVD  dictionary. 


Figure  16:  Decision  Boundary,  10m  TREX13  (3  Asp/dec.) 


•  Task  f:  Multi- A  sped  Classification  Fusion: 

In  a  real  operating  environment,  the  decision  about  the  presence  and  type  of  an  object  is 
typically  made  based  upon  the  observation  of  the  properties  of  the  sonar  returns  at  several 
aspect  angles  or  pings.  This  is  due  to  the  fact  that  the  multi-aspect  processing  yields  substan¬ 
tial  improvements  in  performance,  resolution,  and  sensing  of  the  3-D  properties  of  the  object 
in  a  non-isotropic  environment.  Inspired  by  these  desired  properties,  we  have  developed  two 
different  general  frameworks  [9],  [11],  [8],  [10],  [24]  for  performing  multiple-aspect  or  multiple 
ping  processing  of  sonar  data.  The  framework  in  [9],  [8]  is  based  upon  the  idea  of  multi-aspect 
feature  extraction  that  uses  the  two-channel  canonical  coordinate  decomposition  method  to 
extract  robust  features  with  maximum  coherence  (or  mutual  information)  from  pairs  of  sonar 
pings  with  certain  separation.  The  idea  is  that  the  coherence  pattern  extracted  from  the 
UXO  objects  differ  from  those  of  the  Non-UXO  objects,  hence  aiding  the  overall  classification 
process.  The  theme  of  the  other  framework  is  multi-aspect  classification  using  either  (a)  a 
decision- level  multi-aspect  fusion  [24],  which  linearly  or  non-linearly  combines  the  individual 
classification  decisions,  generated  at  several  aspects,  or  (b)  a  feature-level  multi-aspect  fusion 
[10]  using  HMM  to  generate  one  decision  based  upon  observing  a  sequence  of  AC  feature 
vectors  at  various  aspects  with  ceratin  separations,  or  (c)  a  collaborative  decision-making 
process  [11],  which  uses  sort  of  a  combination  of  the  feature- level  and  decision- level  fusion 
methods.  In  this  work,  we  propose  to  carefully  study  and  test  these  different  methods  and 
compare  their  results  with  the  multi-aspect  MSC  classier  in  (15).  Moreover,  we  intend  to 
revisit  those  weighting  matrix  methods  mentioned  in  the  original  MSC  method  [2]  which  can 
limit  the  influence  of  large  error  amplitudes  for  certain  measurements  or  de-emphasize  less 
important  AC  features  by  assigning  small  corresponding  weights. 

•  Task  5:  Comprehensive  Testing  &  Evaluation: 

We  shall  test  and  evaluate  the  performance  of  the  developed  detection  and  classification 
methods  developed  in  this  phase  of  research  on  several  real  (e.g.,  PondEX,  TREX13,  and 
BayX)  sonar  data  sets  collected  in  different  environmental  and  background  conditions.  The 
specific  issues  that  will  be  thoroughly  studied  include: 
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1 .  Design  and  test  the  modified  version  of  the  broadband  coherence  detector  using  the  pro¬ 
posed  adaptive  thresholding  in  many  different  environmental  and  operating  conditions. 

2.  Using  real  and  synthesized  sonar  data  determine  the  effectiveness  of  the  developed  adap¬ 
tive  dictionary  learning  and  sparse  representation  methods  in  different  situations.  We 
will  investigate  how  the  model  and  dictionaries  can  be  designed  to  guarantee  robustness 
to  partial  burial  and  occlusion  of  the  targets.  Use  these  dictionaries  to  build  matched 
subspace  classifiers  and  test  their  target  versus  non-target  classification  performance 
using  the  ROC  curve  and  confusion  matrix. 

3.  The  developed  kernel  matched  subspace  classifier  will  be  thoroughly  tested  on  the  same 
sonar  datasets.  The  classification  performance  of  the  kernel-based  MSC  will  be  bench- 
marked  against  the  linear  weighted  MSC  using  the  commonly  used  performance  metrics. 
In  addition,  robustness  to  the  environmental  variations  will  be  evaluated  on  both  Pon- 
dEX  and  TREX13  datasets. 

4.  Benchmark  the  performance  of  one  of  the  above-mentioned  multi-aspect  classifier  against 
the  matched  subspace  classifier  in  (15)  which  uses  multiple  AC  feature  vectors  in  the 
data  matrix. 

5.  Prepare  interim  and  final  reports  to  document  all  the  developments  and  results  of  this 
research  after  the  first  and  second  years.  We  plan  to  publish  our  results  in  various  IEEE 
Transactions.  Our  results,  publications,  reports,  and  algorithms  will  be  made  available, 
as  part  of  our  technology  transfer  plans. 
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